negative prompt

https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Negative-prompt

A technique first adopted by AUTOMATIC1111 in the AUTOMATIC1111 version of Stable Diffusion web UI

During sampling, the unconditional condition uses user-specified text instead of an empty prompt

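The sampler repeats the following

Denoise the image so that it is guided toward the prompt (conditional on the prompt)

Example: a man with a mustache

Denoise the image so that it is guided toward the negative prompt

Example: a mustache

Then push the result away from the latter and toward the former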

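I think it builds a tensor pointing in that direction 基素.icon

In other words, the negative prompt decides the starting point of the guiding tensor

In practice this should be the tensor obtained by passing the negative prompt through the text encoder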

https://gyazo.com/d1ed44715026af00f791b579425d4db5

https://stable-diffusion-art.com/how-negative-prompt-work/
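The "starting point" idea above can be tried with plain 2-D vectors. This is only a toy sketch: the vectors and the `guide` helper are made-up stand-ins for the model's predictions, not anything taken from the web UI.

```python
# Toy illustration: the guided result starts at the "negative" prediction
# and moves toward (and past) the "prompt" prediction.
# All values and the helper name are hypothetical.

def guide(start, target, scale):
    """Return start + scale * (target - start), element-wise."""
    return [s + scale * (t - s) for s, t in zip(start, target)]

prompt_pred = [1.0, 1.0]     # stand-in for the prediction given the prompt
empty_pred = [0.0, 0.0]      # stand-in for the prediction given an empty prompt
negative_pred = [1.0, 0.0]   # stand-in for the prediction given a negative prompt

scale = 2.0
with_empty = guide(empty_pred, prompt_pred, scale)        # [2.0, 2.0]
with_negative = guide(negative_pred, prompt_pred, scale)  # [1.0, 2.0]
```

With the negative prompt as the starting point, only the component where the prompt and the negative prompt differ gets amplified.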

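Let's check whether Automatic's implementation actually works this way

I read the code very roughly, so I am not sure this is correct

The codebase is already fairly large, which makes it hard to read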

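In the end, this is what it turned out to do

Convert the input negative prompt into an unconditional condition

Give the sampler the condition and the unconditional condition, and run sampling

Blend these conditions into a single tensor that serves as the goal for the computation

Update the image based on this tensor

Repeat the loop over and over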

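Hmm, I cannot find where denoising is done for each prompt 基素.icon

Maybe because this is the sampler implementation

So what I read was the part that moves the result from the latter toward the former

Steps 1 and 2 (denoising for each prompt) cannot be understood without reading get_learned_conditioning()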

The negative prompt is converted into an unconditional condition

https://github.com/AUTOMATIC1111/stable-diffusion-webui/blob/433b3ab7017556a19173a86d1215ed0a0b5b1396/modules/processing.py#L642

The negative prompt is converted with get_learned_conditioning()

code:py
 def get_learned_conditioning(model, prompts, steps):
     """converts a list of prompts into a list of prompt schedules - each schedule is a list of ScheduledPromptConditioning, specifying the condition (cond),
     and the sampling step at which this condition is to be replaced by the next one.
     Input:
     (model, ['a red crown', 'a [blue:green:5] jeweled crown'], 20)
     Output:
     [
         [
             ScheduledPromptConditioning(end_at_step=20, cond=tensor([[-0.3886, 0.0229, -0.0523, ..., -0.4901, -0.3066, 0.0674], ..., [ 0.3317, -0.5102, -0.4066, ..., 0.4119, -0.7647, -1.0160]], device='cuda:0'))
         ],
         [
             ScheduledPromptConditioning(end_at_step=5, cond=tensor([[-0.3886, 0.0229, -0.0522, ..., -0.4901, -0.3067, 0.0673], ..., [-0.0192, 0.3867, -0.4644, ..., 0.1135, -0.3696, -0.4625]], device='cuda:0')),
             ScheduledPromptConditioning(end_at_step=20, cond=tensor([[-0.3886, 0.0229, -0.0522, ..., -0.4901, -0.3067, 0.0673], ..., [-0.7352, -0.4356, -0.7888, ..., 0.6994, -0.4312, -1.2593]], device='cuda:0'))
         ]
     ]
     """

This tensor is then padded or truncated to the same shape as the condition (the tensor built from the prompt)

code:py
 # for DDIM, shapes must match, we can't just process cond and uncond independently;
 # filling unconditional_conditioning with repeats of the last vector to match length is
 # not 100% correct but should work well enough
 if unconditional_conditioning.shape[1] < cond.shape[1]:
     last_vector = unconditional_conditioning[:, -1:]
     last_vector_repeated = last_vector.repeat([1, cond.shape[1] - unconditional_conditioning.shape[1], 1])
     unconditional_conditioning = torch.hstack([unconditional_conditioning, last_vector_repeated])
 elif unconditional_conditioning.shape[1] > cond.shape[1]:
     unconditional_conditioning = unconditional_conditioning[:, :cond.shape[1]]

https://github.com/AUTOMATIC1111/stable-diffusion-webui/blob/8a34671fe91e142bce9e5556cca2258b3be9dd6e/modules/sd_samplers_compvis.py#L88-L97
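To see what the length matching does without needing torch, here is a minimal sketch on nested lists. `match_token_length` is a hypothetical helper written for this note; it assumes the layout (batch, tokens, embedding) and only mimics the pad-with-last-vector and truncate branches of the excerpt above.

```python
def match_token_length(uncond, target_len):
    """Pad each batch item by repeating its last token vector, or truncate it."""
    matched = []
    for tokens in uncond:  # tokens: list of embedding vectors for one batch item
        if len(tokens) < target_len:
            # Repeat the last vector until the token axis reaches target_len.
            padding = [list(tokens[-1]) for _ in range(target_len - len(tokens))]
            matched.append([list(v) for v in tokens] + padding)
        else:
            # Already long enough: keep only the first target_len vectors.
            matched.append([list(v) for v in tokens[:target_len]])
    return matched

uncond = [[[0.1, 0.2], [0.3, 0.4]]]        # shape (1, 2, 2)
padded = match_token_length(uncond, 4)      # shape (1, 4, 2)
truncated = match_token_length(uncond, 1)   # shape (1, 1, 2)
```

As the original comment says, repeating the last vector is an approximation, not an exact equivalent of encoding a longer prompt.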

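Pass the condition and the unconditional condition to the sampler

In practice the condition arrives as ScheduledPromptConditioning, whose cond is a tensor

There seem to be several ways to specify the sampler, but when the CompVis implementation is called it is selected here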

https://github.com/AUTOMATIC1111/stable-diffusion-webui/blob/8a34671fe91e142bce9e5556cca2258b3be9dd6e/modules/sd_samplers_compvis.py#L14
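Since the condition is a schedule and not a single tensor, something has to pick the entry that applies at each sampling step. The following is a rough sketch of that lookup, not the web UI's code: the namedtuple only mirrors the two fields visible in the docstring, `cond_at_step` is a hypothetical helper, the `<=` boundary rule is my assumption, and strings stand in for the real tensors.

```python
from collections import namedtuple

# Hypothetical stand-in mirroring the fields shown in the docstring.
ScheduledPromptConditioning = namedtuple(
    "ScheduledPromptConditioning", ["end_at_step", "cond"]
)

def cond_at_step(schedule, step):
    """Pick the first entry whose end_at_step has not been passed yet."""
    for entry in schedule:
        if step <= entry.end_at_step:  # assumed boundary rule
            return entry.cond
    return schedule[-1].cond  # past the last boundary: keep the final cond

# Schedule for something like 'a [blue:green:5] jeweled crown' with 20 steps.
schedule = [
    ScheduledPromptConditioning(end_at_step=5, cond="blue"),
    ScheduledPromptConditioning(end_at_step=20, cond="green"),
]

early = cond_at_step(schedule, 3)   # "blue"
late = cond_at_step(schedule, 12)   # "green"
```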

The sample method calls launch_sampling

code:py
 samples_ddim = self.launch_sampling(steps, lambda: self.sampler.sample(S=steps, conditioning=conditioning, batch_size=int(x.shape[0]), shape=x[0].shape, verbose=False, unconditional_guidance_scale=p.cfg_scale, unconditional_conditioning=unconditional_conditioning, x_T=x, eta=self.eta)[0])

https://github.com/AUTOMATIC1111/stable-diffusion-webui/blob/8a34671fe91e142bce9e5556cca2258b3be9dd6e/modules/sd_samplers_compvis.py#L218

From here on I read it assuming the DDIM implementation is used

What actually gets called seems to be around here

https://github.com/CompVis/latent-diffusion/blob/main/ldm/models/diffusion/ddim.py

bing.icon

When you use the import statement in Python, the interpreter searches for the specified module in a list of directories specified by the sys.path variable. This variable is initialized from the PYTHONPATH environment variable and some default locations such as the current working directory and the standard library directories.

If the module is found, it is loaded and made available for use in the current script. If it is not found, an ImportError is raised.

For example, when you run import ldm.models.diffusion.ddim, Python will search for a file named ddim.py in a directory named ldm/models/diffusion within one of the directories specified in sys.path. If it is found, the code within that file will be executed and any objects defined within it will be made available for use in your script.

Even after cloning the repository locally and installing it, I cannot find ddim.py 基素.icon
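One way to find out which file Python would actually load is to ask the import machinery directly. This is a small sketch using the standard library's `importlib.util.find_spec`; `locate_module` is a helper name made up for this note, and the result depends on the `sys.path` of the environment it runs in.

```python
import importlib.util

def locate_module(name):
    """Return the file path Python would load for `name`, or None if not found."""
    try:
        spec = importlib.util.find_spec(name)
    except ModuleNotFoundError:
        # A parent package (for example `ldm`) is not importable at all.
        return None
    return spec.origin if spec is not None else None

print(locate_module("json.decoder"))               # a path ending in decoder.py
print(locate_module("ldm.models.diffusion.ddim"))  # None unless `ldm` is on sys.path
```

Running this inside the same Python environment as the web UI should show where `ddim.py` really comes from.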

The arguments of the sample method match

https://github.com/CompVis/latent-diffusion/blob/66df437e52826a5149a1c20dcc9f0be0abd0f685/ldm/models/diffusion/ddim.py#L56

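ddim_sampling() calls p_sample_ddim() repeatedly. Each call applies the model for one step

Inside p_sample_ddim

When an unconditional condition is given, the sampler (DDIM) applies the model to the image, uc concatenated with cond, and ts (the time step?)

The results e_t_uncond and e_t are then used to update e_t

I do not understand what e_t means 基素.icon

A tensor that takes both the prompt and the negative prompt into account?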

e_t = e_t_uncond + unconditional_guidance_scale * (e_t - e_t_uncond)

If we read e_t as ∇cond, e_t_uncond as ∇uncond, and unconditional_guidance_scale as CFGnorm, this formula becomes

(1 - CFGnorm)∇uncond + CFGnorm∇cond, which matches the formula in classifier-free guidance#6427feac774b170000f2ad53
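The rewrite is just expanding the bracket: e_t_uncond + s * (e_t - e_t_uncond) = (1 - s) * e_t_uncond + s * e_t. A quick numeric check with scalars, using two function names invented for this note:

```python
def cfg_code_form(e_t, e_t_uncond, scale):
    """The form that appears in the DDIM code."""
    return e_t_uncond + scale * (e_t - e_t_uncond)

def cfg_weighted_form(e_t, e_t_uncond, scale):
    """The weighted-sum form: (1 - scale) * uncond + scale * cond."""
    return (1 - scale) * e_t_uncond + scale * e_t

e_t, e_t_uncond = 0.75, 0.25
for scale in (0.0, 1.0, 7.5):
    a = cfg_code_form(e_t, e_t_uncond, scale)
    b = cfg_weighted_form(e_t, e_t_uncond, scale)
    assert abs(a - b) < 1e-12  # both forms agree
```

A scale of 0 returns the unconditional value unchanged and a scale of 1 returns the conditional one. Anything above 1 extrapolates past the conditional value, away from the unconditional (negative prompt) side.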

In the end it returns two values

https://github.com/CompVis/latent-diffusion/blob/66df437e52826a5149a1c20dcc9f0be0abd0f685/ldm/models/diffusion/ddim.py#L203

code:py

return x_prev, pred_x0

1. pred_x0 = (x - sqrt_one_minus_at * e_t) / a_t.sqrt()

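x is the input image, and sqrt_one_minus_at is a 4-dimensional tensor

at appears to be alpha_t

Ignoring the coefficients, this is e_t → input image as a tensor. I think of e_t as something like the final goal, so is this a tensor pointing from the goal toward the input image?

Wouldn't it be strange unless the tensor pointed from the input image toward the goal?

Probably this is the estimate in the x_0 direction shown here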

https://gyazo.com/85989a82a3b2b6b120ade5e485e0415c

Denoising Diffusion Probabilistic Models

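DDIM does not use the DDPM equations as they are, so apparently it does not follow this figure exactly

Apparently it uses an implicit formulation (the I in DDIM stands for Implicit)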

2. x_prev, the image generated at this step

x_prev = a_prev.sqrt() * pred_x0 + dir_xt + noise

Literally, it looks like the predicted image one step before the current x

a is alpha. dir_xt is the direction toward x_t
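Working the two formulas through on scalars makes the roles clearer. In ddim.py, e_t is the output of the model, i.e. its noise prediction. The sketch below assumes that reading and fills in `dir_xt` as sqrt(1 - a_prev - sigma_t^2) * e_t, which is how I read ddim.py. `ddim_step` is a name made up for this note, and a single float stands in for the image tensor.

```python
import math

def ddim_step(x, e_t, a_t, a_prev, sigma_t=0.0, noise=0.0):
    """One DDIM update on a scalar 'image', following the two formulas above."""
    # Estimate of the clean image: remove the predicted noise and rescale.
    pred_x0 = (x - math.sqrt(1.0 - a_t) * e_t) / math.sqrt(a_t)
    # Direction pointing back toward x_t (assumed definition, see lead-in).
    dir_xt = math.sqrt(1.0 - a_prev - sigma_t ** 2) * e_t
    # Image for the previous (less noisy) time step.
    x_prev = math.sqrt(a_prev) * pred_x0 + dir_xt + sigma_t * noise
    return x_prev, pred_x0

# Build a noisy x from a known x0 and known noise, then feed the true noise back in.
x0, eps = 2.0, 0.5
a_t, a_prev = 0.25, 0.64
x = math.sqrt(a_t) * x0 + math.sqrt(1.0 - a_t) * eps
x_prev, pred_x0 = ddim_step(x, eps, a_t, a_prev)
```

If the noise prediction is exact, pred_x0 recovers x0. So pred_x0 reads as "the clean image implied by the current noise estimate", and x_prev is that estimate re-noised to the previous step's noise level.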

The step update after sampling uses the second return value (pred_x0)

code:py
 def after_sample(self, x, ts, cond, uncond, res):
     if not self.is_unipc:
         self.update_step(res[1])  # res[1] = pred_x0
     return x, ts, cond, uncond, res

https://github.com/AUTOMATIC1111/stable-diffusion-webui/blob/8a34671fe91e142bce9e5556cca2258b3be9dd6e/modules/sd_samplers_compvis.py#L127-L131

The loop keeps running on the result of the previous iteration

https://github.com/CompVis/latent-diffusion/blob/66df437e52826a5149a1c20dcc9f0be0abd0f685/ldm/models/diffusion/ddim.py#L148

pred_x0 is captured in a variable but is not used by the loop itself (it was already used when x_prev was generated)
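The data flow of the loop, as far as I can tell, looks roughly like this. Everything here is a toy: `toy_step` is a hypothetical stand-in for p_sample_ddim with made-up arithmetic, and the only point is which return value feeds the next iteration.

```python
def toy_step(img, step):
    """Hypothetical stand-in for p_sample_ddim: returns (x_prev, pred_x0)."""
    pred_x0 = img * 0.5            # made-up "clean image" estimate
    x_prev = (img + pred_x0) / 2   # made-up, slightly less noisy image
    return x_prev, pred_x0

def toy_sampling(x_T, steps):
    img = x_T
    recorded = []  # pred_x0 is only recorded, it does not drive the loop
    for step in reversed(range(steps)):
        img, pred_x0 = toy_step(img, step)  # x_prev becomes the next input
        recorded.append(pred_x0)
    return img, recorded

final, recorded = toy_sampling(8.0, 3)
# final == 3.375, recorded == [4.0, 3.0, 2.25]
```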